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Abstract 

The approach to the ergodic limit in Monte Carlo simulations is studied using both analytic 
and numerical methods. With the help of a stochastic model, a metric is defined that enables the 
examination of a simulation in both the ergodic and non-ergodic regimes. In the non-ergodic regime, 
the model implies how the simulation is expected to approach ergodic behavior analytically, and the 
analytically inferred decay law of the metric allows the monitoring of the onset of ergodic behavior. 
The metric is related to previously defined measures developed for molecular dynamics simulations, 
and the metric enables the comparison of the relative efficiencies of different Monte Carlo schemes. 
Applications to Lennard- Jones 13-particle clusters are shown to match the model for Metropolis, 
J-walking and parallel tempering based approaches. The relative efficiencies of these three Monte 
Carlo approaches are compared, and the decay law is shown to be useful in determining needed high 
temperature parameters in parallel tempering and J-walking studies of atomic clusters. 
PACS numbers: Q5.10.Ln, Q2.70.Lq 



I. INTRODUCTION 

A goal of Monte Carlo (MC) simulations in statistical 
mechanics ||l[ is the calculation of ensemble mean values 
of thermodynamic quantities. Ensemble mean values are 
multidimensional integrals over configuration space 

(U) = ydxP(x)C/(x) , (1) 

where P(x) is the probability of finding a system in the 
state defined by x, and the functional form of P(x) de- 
pends on the ensemble investigated. MC simulations usu- 
ally generate a sampling of configuration space {xfc}^-^ by 
the use of a stochastic process with stationary probability 
P(xfe). The quantity U evaluated at x^ is the output of 
the simulation [/(xfe) = Uk, and its arithmetic mean value 
[/, in principle, must approach the ensemble mean value. 
||l[ In this paper we refer to the set of configurations gener- 
ated in a Monte Carlo simulation as a time sequence, and 
we study the behavior of these temporal sequences {Uk} 
and their arithmetic mean, to understand better how MC 
simulations approach ergodic behavior. It is important to 
emphasize that there are two time variables to consider. 
The time variable k labels the separate configurations gen- 
erated in a Monte Carlo walk. Variations of properties 
with k provide information about the short-time behavior 
of a MC simulation. The time variable K labels the total 



length of the MC walk, and variations of computed prop- 
erties with K provide information about the convergence 
of the simulation on a long time scale. 

Given an infinite time, the stochastic walker in a MC 
simulation visits every allowed point in configuration 
space. 1^ Ergodic behavior is reached when the length 
of the walk is sufficiently long to sample configuration 
space appropriately. [|j In practice, this does not mean 
that the space has been densely covered but that every 
region with non-negligible probability has been reached. 
In such a case we can say that the simulation is effectively 
ergodic or that it has reached the ergodic limit. 

For a finite walk, in the event of broken ergodicity , 
phase space is effectively disconnected. The different dis- 
connected regions (called components) are separated by 
barriers of zero effective probability. If a stochastic walker 
starts its walk in one of these regions, it may not cross the 
barriers within the time of the simulation. If the simula- 
tion length is increased, some barriers may become acces- 
sible for the walker and phase space is better sampled. We 
can conclude that a time r exists such that, for simulation 
lengths shorter than r, the walker becomes trapped in one 
of the phase space components. For simulation lengths 
much larger than r, phase space is effectively covered by 
the walker. 

In this study we imagine a system having more than one 
time scale ri <C T2 <C . . . <C ta. In a Monte Carlo sim- 
ulation each scale comes from stochastic processes with 
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different correlation times. ||] A precise definition of the 
correlation times for Monte Carlo processes is given in 
Section III, but for the moment we can think of these cor- 
relation times as identical to physical time scales of the 
system under study. To understand these time scales more 
fully, it is useful to focus on an example. Prototypical of 
systems having such disparate time scales are atomic and 
molecular clusters. Typical cluster potential surfaces have 
many local minima separated by significant energy barri- 
ers. [^-^ The local minima can be grouped into basins 
of similar energies, with each basin separated from other 
basins again by energy barriers. At short Monte Carlo 
times a cluster system executes small amplitude oscilla- 
tions about one of its potential minima. We can think of 
these vibrational time scales as the shortest time scales 
that define a cluster system. As the simulation time is 
extended the system eventually hops between different lo- 
cal minima within the same basin. The time scale for 
the first hops between local minima can be considered 
the next shortest time scale for the simulation. At still 
longer Monte Carlo times, the system hops between dif- 
ferent energy basins defining yet another time scale for the 
simulation. This grouping of time scales continues until 
the longest time scale for a given system is reached. At 
Monte Carlo times that are long compared to this longest 
time scale, the simulation is ergodic. 

Consider a system with several time scales as mentioned 
above. If the length of the simulation is smaller than the 
smallest correlation time, the walker may become trapped 
in an effectively disconnected region and the sampling of 
phase space is incomplete. By increasing the time, the 
memory of the initial condition in the sampling decreases 
as the walker crosses to other previously unreachable re- 
gions. These oscillations and hoppings can be modeled by 
a superposition of stochastic processes with different cor- 
relation times. These processes with non-zero correlation 
times are known as colored noise processes (as opposed 
to zero correlation time white noise processes). |5[| From 
the study of the autocorrelation functions of a stochas- 
tic model defined using these colored noise processes, we 
can verify that, at a fixed run length K , there exist two 
different groups of processes; those that contribute to the 
autocorrelation function with terms that decay like 1/k 
(called diffusive processes), and those that contribute to 
the autocorrelation function with terms that decay slower 
than 1/k (called non-diffusive processes). When the time 
of the simulation is increased, some non-diffusive processes 
at shorter run lengths, start to contribute to the autocor- 
relation function like diffusive processes. After the walk 
length reaches the largest correlation time ta, all processes 
contribute to the autocorrelation function with terms that 
decay like 1/k. At this point, the simulation is at the dif- 
fusive regime and effective ergodicity has been reached. 
A principal goal of this work is to investigate the way in 
which the MC output {Uk} reaches the diffusive limit (i.e. 
the ergodic limit) by studying the properties of autocorre- 
lation functions under changes of scale in time, K bK 
with h > 1. By time scaling it is possible to infer the decay 



law of the non-diffusive contributions with respect to the 
total simulation time K. The functional dependence of 
the non-diffusive contributions on the parameter 6 that is 
used to scale K is determined empirically. We have found 
the decay law so determined to be a particularly valuable 
method of concluding when a simulation can be considered 
ergodic. Unlike previous studies PPHTl|] that only have 
investigated the behavior of certain autocorrelation func- 
tions in the ergodic regime, by focusing on the approach 
to ergodic behavior we have a more careful monitor of the 
onset of ergodicity. Once the non-diffusive contributions 
have decayed to a point where they arc too small to be 
distinguished from zero to within the fluctuations of the 
calculation, we can say that the ergodic limit has been 
reached. 

The autocorrelation functions we use to measure the 
approach to the ergodic limit are based on one of the 
probes of ergodicity developed by Thirumalai and co- 
workers P,p|-pl| , and is often called the energy metric. 
The energy metric has been proposed as an alternative to 
other techniques ||^ (like the study of the Lyapunov expo- 
nents fl^ ) for the study of ergodic properties in molecular 
dynamics (MD) simulations. The metric has been used to 
study the relative efficiency of MC simulation methods 
as well, jis) The MC metric as used in the current work 
can easily be extended from the energy to other scalar 
observables of the system. 

We present two key issues in this paper. First, from the 
knowledge of the decay law of the non-diffusive contribu- 
tions to the MC metric, we infer how long a simulation 
must be to be considered effectively ergodic. Second, once 
the ergodic limit is reached, we can compare the results 
from different numerical algorithms to measure relative 
efficiencies. Because the outcomes of MC simulations are 
noisy, we have found it useful to separate diffusive and 
non-diffusive terms in the MC metric with a Fourier anal- 
ysis so that we can neglect the high frequency compo- 
nents of the noise. This technique has given reproducible 
results. 

To test the match between the stochastic model and ac- 
tual Monte Carlo simulations, we examine the approach 
to ergodic behavior in simulations of Lennard- Jones clus- 
ters. Recently we have studied the thermodynamic 
properties of Lennard- Jones clusters as a function of tem- 
perature using both J-walking p6| and parallel temper- 
ing methods. |[r^-[T9[| Both simulation techniques require 
an initial high temperature that must be ergodic when 
Metropolis Monte Carlo methods are used. If the 
Metropolis method does not give ergodic results at the 
initial high temperature, systematic errors propagate to 
the lower temperatures in J-walking and parallel temper- 
ing simulations, and the results can be flawed or mean- 
ingless. In most Monte Carlo simulations of clusters at 
finite temperatures, |^,^ the clusters are defined by en- 
closing the atoms within a constraining potential about 
the center of mass of the system. The constraining po- 
tential is necessary because clusters at finite temperatures 
have finite vapor pressures, and the association of any one 
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atom with the cluster can be iU-defined. From experience 
we have found that if the radius of the con- 
straining potential and the initial high temperature are 
not both carefully chosen, it can be difficult to attain er- 
godicity with Metropolis methods. A key concern then is 
the choice of constraining radius and the choice of initial 
temperature. We verify the stochastic model by investi- 
gating Monte Carlo simulation results as a function of the 
temperature and the size of the constraining potential. 

The contents of the remainder of this paper are as fol- 
lows. In Section II we motivate the studies that follow by 
examining numerally the behavior of a set of Monte Carlo 
simulations of a 13-particle Lennard-Jones cluster. This 
cluster system is used to illustrate the results through- 
out this paper. In Section III we introduce the stochastic 
model based on a continuous time sequence. In Section 
IV we extend the model to discrete time sequences char- 
acteristic of actual Monte Carlo simulations. In Section V 
we test the discrete stochastic model with applications to 
Lennard-Jones clusters and in Section VI we summarize 
our conclusions. Many of the key derivations needed for 
the developments are found in two appendices. 



II. AN EXAMPLE CALCULATION 

Before discussing the major developments of this work, 
it is useful to understand the nature of the problem we 
are attempting to solve by examining some numerical re- 
sults on a prototypical system. We take the 13-particle 
Lennard-Jones cluster defined by the potential function 
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i=2 j = l 
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where e and a are the standard Lennard-Jones energy 
and length parameters, N is the number of particles in 
the cluster (13 in the present case), rij is the distance 
between particles i and j 

and Vc is the constraining potential discussed in Sec. I 
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where Xc is the coordinate of the center of mass of the 
cluster and Rc is the radius of the constraining sphere. 
The 13-particle Lennard-Jones cluster has a complex po- 
tential surface with many minima separated by significant 
energy barriers, 1^-^ and ergodicity problems associated 
with the simulation of properties of this system are well- 
known. We now consider a Metropolis MC simula- 
tion of the average potential energy of the system in the 



canonical ensemble at temperature kgT/e = 0.393(A:b is 
the Boltzmann constant). This average potential energy 
Vk is defined by 



k 



(5) 



and is displayed in the upper panel of Fig. 1 as a func- 
tion of the walk length k for 20 independent simulations 
each initialized from a random configuration. Over the 
maximum time scale K of the walks, it apparent that the 
potential energy averaged over each independent walk has 
not converged to the same result. Such unreproducible 
behavior is symptomatic of a simulation not yet at the 
ergodic limit. 
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Fig. 1: The upper panel shows the "time evolution" of Vfe (in 
units of e) for M = 20 independent experiments. The lower panel 
shows df; (in units of e^) vs. k for the experiments of the upper panel. 
Rc has been set to 4cr and kgT/e = 0.393. At least two basins with 
different energies are present. Clearly, df, goes to a constant when 
k is increased within the total time scale of the simulation. 

At the ergodic limit (i.e. for the maximum walk length 
K greater than that included in Fig. 1) the averages dis- 
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played in the upper panel of Fig. 1 must approach the 
same value for each walker. Using related ideas developed 
elsewhere, the extent to which the walks approach 

the same limit can be measured in terms of a metric dk 
defined by 



2 " r 
M(M - 1) ^ ^ [ 

^ > i=2 7 = 1 



(6) 



In Eq. M represents the number of independent walks, 

and Vj^. is the average potential energy computed in walk 
i at MC time k. The metric measures the energy fluctu- 
ations in the walk as a function of the walk length. For 
an ergodic simulation, the metric must decay to zero. For 
the 20 simulations of the 13-particle Lennard-Jones clus- 
ter, the metric as a function of k is plotted in the lower 
panel of Fig. 1. Rather than asymptotically approaching 
zero, over the short length of the walk displayed here, dk 
has decayed to a constant, and as discussed later in this 
paper, over the time scale of this simulation, dk can be 
qualitatively represented by the function 



k 



Bk 



(7) 



where Ak and Bk are coefficients that are dependent on 
the total walk length K. As K is increased to a time 
where the walk is ergodic, Bk must decay to zero. Major 
goals of this work are to understand how Bk decays and 
to use the discovered decay law to determine the onset 
of ergodic behavior. Our approach is to introduce first a 
continuous stochastic model of a simulation followed by a 
discrete model more clearly linked to actual MC studies. 



III. STOCHASTIC MODEL 



U{t) = Uc + + J2^^9eit/Te 



(8) 



where Uc is a constant, the random variable ^(t) represents 
white noise processes with zero correlation time (tq = 0), 
and the {ge{t/Tg)} are stochastic processes with correla- 
tion times Ti > 0. ^{t) and gi{t/Ti) have units of the 
square root of time, and Fq and F^ are constants with 
units of U'^/t. If U is chosen to be the the x-coordinate 
of a particle, Fg and F^ have units of a diffusion constant. 
Consequently we refer to these constants as generalized 
diffusion coefhcients. The white noise process has the fol- 
lowing properties 



m) = 



(9) 
(10) 



and the remaining colored noise processes are assumed to 
satisfy 



{g,{t/T,))=Q 
{9i{t/'rt)gi{t' /n)) = — ft [- — ^— 



(11) 

(12) 



so that they represent processes with a memory fg. Even 
though correlations between processes with different cor- 
relation times may be non-zero, we assume the processes 
to be independent, i.e. 

{gi{tlTg)gi,{s,' In,)) - {,gt{iln))(^gt(i! In,)) 

= V^T^f (13) 

m)g^{t' In)) ^ m){gi{t' In)) 

= V tandf . (14) 



We have discussed in the introduction how the output 
of MC simulations can be considered to be a combination 
of stochastic processes with different time scales, and how 
the contributions to autocorrelation function from these 
processes can vary when the length of the simulation is 
enlarged. Here we present a continuous time model for 
the stochastic processes that occur in a simulation. Even 
though a MC simulation occurs in a discrete time (each 
MC point represents a time unit), we find that the con- 
tinuous model helps to understand better the ideas used 
in the modeling of the MC output. 

In this section the ensemble mean value is used to find 
the expression for the autocorrelation functions of the 
model. Although in actual numerical calculations the en- 
semble mean is replaced by a mean over a finite number 
of independent experiments, the results obtained here give 
information about the limit of an infinite sample. 

The stationary process used to sample space is a 
stochastic process. We assume the output of the MC simu- 
lation can be modeled by a linear superposition of stochas- 
tic processes with different correlation times Tg > 0, 



The memory function is assumed to be a continuous func- 
tion that depends only on the distance between t and t' 
disregarding the time origin (stationary condition). The 
memory function represents the correlation between two 
times of the process gg. In our model we impose the con- 
dition 

-h (-) < fdt' -fg (-\<oo. (15) 
n \nj Jo Tg \nj 

The scope and implications of the leftmost inequality are 
explored in Appendix A. In Appendix A we also examine 
the conditions fg must satisfy in order to yield contribu- 
tions to autocorrelation function that decay more weakly 
than lit. We now assume that this inequality can be 
taken as a bound to possible maxima of fg appearing at 
t > 0. The rightmost inequality enables us to assume fg 
is normalized 

r*i/.(^)^K ,«) 

J-oo n \nj 
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Wc have identified here the time scale ti with the correla- 
tion time of the stochastic process gi. This identification 
is valid if 



f 



\t\ 



n 



dt —h — = > 



(17) 



which implies that the behavior of at large t must be 
O oj. smaller. 

In addition, by the properties of the ensemble mean 
value, we have that for all real A 



0< + \9i{iln)X 

< {9t{t/nf) + 2A {gt{t/n)g,{t'/n)) + \^ {gt{t'/nf) 



< 



Ti 



\t~t'\ 



n 
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Equation ( p^ must be true for all A. Therefore, the dis- 
criminant of the polynomial in A must be non-positive 



\t-t'\ 



MO) 



fi 



\t-t' 



fiiO) 



< . 



(19) 



Consequently, fe{0) — nia.x{fe{x)V x > 0}. Other proper- 
ties of fg are studied in Appendix A. 

The ensemble mean value (U) is time independent. The 
ensemble mean value of the noise processes is zero. There- 
fore, Uc must be equal to ([/). Processes defined by Eq. 



(^) have two different components, uncorrelated white 
noise and correlated processes with correlation time ti. 
Because the goal of the simulation is the calculation of 
the ensemble mean (U) by the analysis of the time series, 
we study the behavior of the temporal mean U{t) 



U{t) = - dt' U{t') 
t Jo 

1 1 ^ 

= {U) + -Vf'oW{t) + -J2V^iGi{t/n) , (20) 



where W{t) is a Wiener process. 



Wit) - / dt' m 



(21) 

{W{t)) = (22) 
{W{t)W{t')) ^t^ , (23) 

with i< = min(t,i'), and Ge{t/Te) = J^dt' gi,{t' /ti). 

Equation (20) implies that the evolution of the tem- 
poral mean U (t) has the same structure as U , with an 
uncorrelated term and terms with tailed correlation func- 
tions. 

The autocorrelation function of the process U at times 
t and t' is defined by 



K{t,t')^{{U{t)-{U)) {U{t')-{U))) 



To 
tt' 



{W{t)W{t')) + l^J2^g {Ge{t/n)Gg{t'/n)) , 



(24) 



where we have used Eqs. (gj|) and (|14| ) to neglect terms involving processes with different correlation times. 

Because we have assumed fe is a continuous function, fe reaches its maximum and minimum value within any closed 
interval considered. The ith non-diffusive contribution to K(t, t') 



l^{Ge{t/n)Geit'/n)) = ^ / dt. 



dt2 —fe 



1 , 



(25) 



is bounded 



t<t^. 



dt^ 



dt2 — fe 



Te 



Te 



<\,{Ge{t/Te)Gi{t'/Ti))< ^ 



tt' 
1 



i<i> Jo 
1 



dt^ 



< -j{Geit/Te)Gfit'/n)) < -/,(0) 
tt' Te 



dt2 -fe{0) 
Te 



(26) 



where t> — max(t,<'), and tmin is the time at which fi 
reaches its minimum value in the closed interval [0,i>]. 
There exists a t}{ty) £ [0,trnm] Q such that. 



l^{Giit/Te)Geit'/Te)) = ^Je(^ 



t*eit>) 

Tl 



Using Eqs. (pS) and (E^ in (24), wc find that 



(27) 



^(t,0 = T^ 



A p 

Tl 

l=\ 



Te 



(28) 



For all times shorter than ri the autocorrelation function 
is the sum of diffusive contributions (proportional to 1 /t) 
plus non- diffusive contributions. These contributions im- 
plicitly depend on t> through t^{ty). We assume that fe 
satisfies the conditions stated in Appendix A, so that the 
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dependence of fi on t is weaker than 1/t (for total time 
scales shorter than r^; see Appendix A). 

We next consider the behavior of Eq. ( p8| ) for time 
scales greater than ri. Under the scale change t ^ bt 
such that Ti <^ bty <^ T2, the contributions to the cor- 
relation function from the process with correlation time 
Ti can be considered diffusive [in other words, by virtue 
of Eqs. (p^ ) and /i/ti has become a delta func- 

tion]. With bty ^ T2, the other processes preserve their 
old properties. Then, the autocorrelation function can be 
expressed 



i: k/re ' 



\k^k'\ 



(33) 
(34) 



The true ensemble average {U) does not depend on the 
index m. 

In the discrete case we define a metric 



2 r 
M{M - 1) ^ ^ ^ 



(35) 



K{bt, bt') 



Ti 



bU 



(29) 



The complete derivation of Eq. (g9|) can be found in Ap- 
pendix B. For a times larger than the correlation time ta, 
all contributions to the autocorrelation function are diffu- 
sive, the simulation can be considered ergodic, the sam- 
pling complete, and the temporal mean is equal to the 
ensemble mean within C'(l/<) mean square fluctuations. 



where the bars represent the temporal mean value 



jj{m) _ 1 



with 



(m) 



fc' = l 



= {U) + 



_0 yy(m) 



1=1 



(36) 



IV. DISCRETE TIME SEQUENCES AND THE MC 
METRIC 

Monte Carlo simulations generate discrete sequences Uk 
of values of the quantity under study. Additionally, in ac- 
tual calculations the ensemble of sequences is represented 
by a finite rather than an infinite set. In this section, the 
model developed in the previous section is extended to 
finite sets of discrete sequences. We express the M se- 
quences 1 1^^™'' [ , where the label (m) ranges from 1 

to M . The exact ensemble mean value ([/) can be ob- 
tained in the limit that M becomes infinite. In analogy 
with the model developed in Section III, each output is 
assumed to have the form 



where 



(™)\ 



(30) 



(31) 
(32) 



(m) 



(m) 



k' = l 
k 



G 



(m) 
t, k/rit 



: k' /ti 



(37) 
(38) 



k' = l 



Observe that in the present case, our finite sample of the 
infinite ensemble is the set of outcomes from M indepen- 
dent numerical experiments. The metric we have defined 
in Eq. (|3^) can be contrasted with alternative metrics 
Pj9|,p^ previously defined for molecular dynamics simula- 
tions. These alternative metrics examine the fiuctuations 
of two simulations initialized from different components of 
configuration space averaged with respect to all the par- 
ticles in the system. The metric we use in this work is de- 
termined using an average with respect to M independent 
simulations that represent a subset of the full ensemble. 

Using the model presented in Eq. (pO|), we now develop 
a way to predict the behavior of the MC simulation in 
the non-ergodic and the ergodic regimes. We first con- 
sider the case that the total simulation time K is larger 
than the first correlation time ti but shorter than T2, i.e. 
Ti <^ K T2. The expression for dk is given by 



M i-1 



M(M 

^ ' i=2 j = l 



M 
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M{M ■ 
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A M Tj/(i) Mi) A l-l M Mi) Mi) 



l=\ i=\ l=2l' = \ 1=1 

. M i-1 

^ ' i=2 j=l 

If the number of experiments M is sufficiently large, we can neglect terms involving processes with different correlation 
times, and products of sequences belonging to different experiments. Under these assumptions we obtain 

4. = 2^^y^ + 2^^v^^-^/-- +2yr,^v . (40) 

i=l i=l 1=2 1=1 \ / 



Equation (BO) preserves the form of Eq. 



2|). To make 



this statement explicit, let us rewrite Eq. (40) as 



k 



where 



1 *^ w^''^ 1 

ro^^^E^ + Ti^^E 

j=i 



i=l 



M 

«=2 i=l 



(41) 

(42) 
(43) 



In Appendix B we present a study of the way non-diffusive 
contribution become diffusive under time scale changes. If 
M is sufficiently large and ti K <^ T2, by virtue of Ap- 
pendix B, Ffc must roughly be a constant. By roughly a 
constant we mean a constant C plus some rapidly fluctuat- 
ing function C^, with the following properties: a) (^fc) = 
and b) |C| > maxfc=i_2,...,_ft:(|Cfc|)- Then 



Tfe — + Cfc 



(44) 



If K is enlarged, we expect to have a larger value of Tk- 
Tfc is a quantity related to the memory functions fi with 
correlation times ti ^ K. In the continuous time model, 
the colored noise processes contribute to the autocorre- 
lation function with terms proportional to /^(f^(t>)/Tf), 
which are weakly dependent on t (see Appendix A). We 
can expect Tfc to be weakly dependent on fc, and for se- 
quences of length K and for M sufficiently large, we con- 
sider this quantity roughly to be a constant 



K 



-13k 



(45) 



where (3k represents additional random noise. Then, for 
a given length k < K, the MC metric dk can be approxi- 
mated by 



dfc = 2 2 T 



k 



K 



Ik 



(46) 



where 7fe = 2(C,k/k -f Pk) represents remaining stochas- 
tic noise from both contributions. In this approximation, 



T K and T k are the quantities that carry the long time 
dependence. Short time features appear in the 1/fc de- 
pendence and in the remaining noise 7^. If the sequences 
considered are increased in size by a factor of 6, such that 
TA-i < ii: < TA < 6if for a given 1 < A < A, E^ {T k) 
is increased (decreased) (see Appendix B). Then, 



dfcfc = 2 



i bK 
'bk 



2T 



bK 



Ibk 



(47) 



where T^if must go to zero and T^k must approach a 
constant when h is increased. By virtue of the expected 
behavior of the non-diffusive contributions (see Appendix 
A), we propose the following expression for T^/f 



(48) 



where (j){b) is a decreasing function of h. Moreover, TbK 
is a sum of non-diffusive contributions. As presented in 
Appendix A, each non-diffusive contribution to the auto- 
correlation function has a relative variation smaller than 
the relative variation of the diffusive contribution, namely 
1 — 1/6. If this inequality is applicable to the sum of non- 
diffusive contributions, we have that 



1 



> 1 



bK 



T 



K 



1 - - > 1 - m 

for all 6 > 1. Then, (f) must be either 

m = b-- 

or 

m 



77 In(fo) -t- 1 



(49) 

(50) 
(51) 



(52) 



(53) 



with < u < 1 and < 77 < 1. Equation ( p3| ) can be 
thought as the limit of Eq. ( p2[ ) when the exponent goes 
to zero. We know of no a priori argument to justify Eq. 
(E3). However, as is discussed in Section V, our numerical 
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experience has shown Eq. (48) to be obeyed in ah cases 
we have examined. 

Our goal is to develop a criterion to decide when the 
simulation can be considered ergodic. From the previ- 
ous considerations it is clear that the ergodic limit is 
reached when Tk is indistinguishable from zero. The 
output from a MC simulation is usually noisy. There- 
in 



fore, 7fe can not be neglected. A useful way to separate 
diffusive and non-diffusive contributions and to eliminate 
the stochastic noise from Eq. (^), is to perform a Fourier 
analysis of the function kdk ■ Let us define the frequencies 
ujn = {2tt/ K) n, with n — Q^\, . . . ,K — 1. The discrete 
Fourier transform of the function kdk is the signal Yi<-(w„) 



>k(w„) = kdk{Ldn) = X! exp(-iw„fc) kdk 



(54) 



fc=i 



K 



K 



= — ^ exp("iLj„fc) Tk + ^ X! ^ cxp(-itj„fc) Tk + kjki^n) 

k=l k=l 

= 2 (5„,o Tk + {SnAK + 1) + (1 - 5„,o) (1 + « cot(tj„/2))} Tk + k^k{uJn) 
= 25nflTK + {KSnfl + 1)Tk + i [l- <5„,o) cot(a;„/2) Tk + kjk{(^n) 



(55) 



In general, k^ki'^n) is negligible except at high frequen- 
cies. For small positive values of the frequency we can 
make the approximation cot(a;„/2) ~ 2/cl;„. From this 
approximation we have 

Im(FKK))- — T/^ . (56) 

The real part of Eq. (|5|) for positive frequencies is 

Re(yK(co„)) = Tk . (57) 

Even though simpler than Eq. (|5^), we have found Eq. 
( ^7|) is more sensitive to the deviations of dk from the ap- 
proximation Eq. (p6[). Therefore, the data obtained from 
the real part is of poorer quality than the data obtained 
from the imaginary part. 

Equation ( pq ) implies that for a given simulation length 
K, the contributions to the MC metric from the non- 
diffusive process can be determined from a simple rela- 
tionship involving the Fourier transform of the function 
kdk at low frequencies. By increasing the length of the 
run X by a factor of 6, it is possible to observe the depen- 
dence of ThK on hK . 



V. APPLICATIONS 

The concepts developed in the previous sections are suf- 
ficiently general to be applied to any kind of MC sim- 
ulation. We devote the present section to the applica- 
tion of the developments of this paper to the study of the 
Lennard- Jones 13-particle cluster in the canonical ensem- 
ble. This system has been introduced previously in Sec. 
II. 

Some thermodynamic properties of clusters as a func- 
tion of temperature exhibit rapid changes that are remi- 
niscent of similar changes that occur for the same proper- 
ties in bulk systems at phase transitions. In a bulk system 
a phase transition occurs at a single temperature. For 



clusters the rapid changes in thermodynamic properties 
occur over a finite temperature interval. To distinguish 
the temperature range where thermodynamic properties 
change rapidly in clusters from a true phase transition, 
we follow Berry et al. |2^] and refer to such changes in 
physical properties as associated with a phase change. A 
common property that has been found to be useful in mon- 
itoring these phase change intervals of temperature is the 
heat capacity at constant volume [ p6[ 

Cv{T)^^^{{V-{V)Tf)^ + \NkB , (58) 

where {■)t represents the classical canonical mean value. 

In this work we consider the bare Metropolis (Met), |20| 
J-walking (Jw), ^ and parallel tempering (PT) 0-19| 
approaches to Monte Carlo simulations. The free variable 
of all these methods is the reduced temperature ksT/e. 
In PT and Jw simulations, the highest temperature used 
(Th) must be sufficiently large to ensure that Met is er- 
godic. From experience simulating a variety of sys- 
tems, we have found that Th must also be lower than a 
temperature Tf, where cluster evaporation events become 
frequent. It is useful to think of Ti, as the cluster ana- 
logue of a boiling temperature. We have found that Met 
is unable to sample the boiling phase change region for 
clusters ergodically, using total time scales accessible to 
current simulations. 

For the results that follow, ujT^'^ is chosen to be repre- 
sented by the potential energy of the system. In general 
t/^.™^ can be any scalar property of the system. We define 
a pass to represent a set of single particle MC moves taken 
sequentially over the 13 particles in the cluster. We take 
J/^™'' to be the potential energy at the fc'th pass, in the 
m'th experiment. Using Eq. ( p5[ ) we can write 

YK{0) = 2TK + iK + 1)Tk . (59) 

In the non-ergodic regime, Yk{0) grows with K, while in 
the ergodic regime, the signal ^^-(0) approaches a con- 
stant. 
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We begin by displaying results obtained for a calcula- 
tion that has not attained ergodicity over the time scale 
of the simulation. We examine the 13-particle Lennard- 
Jones cluster with the Met algorithm setting Rc = ia at 
a temperature of kBTh/e — 0.393. The temperature is 
chosen to be that typically used as the initial high tem- 
perature in Jw and PT studies of LJ13. By choosing a 
large constraining radius, the evaporation events are so 
frequent at the chosen temperature that attaining ergodic- 
ity proves to be quite difficult. We demonstrate the effect 
of reducing the constraining radius shortly. 



200000 



150000 - 



£ 100000 - 



50000 - 



the "time evolution" of the temporal mean values of 15 
experiments. 



12.0 




4000 6000 
K 



10000 



Fig. 2: The upper panel is tlie signal lx(0) (in units of e^) vs. 
K for Rc = 4(T at ksTh/e = 0.393. from M = 40 independent ex- 
periments, of LJ13. The length of the simulation is 10^ MC passes. 
The lower panel shows the "time evolution" of U k (in units of e) 
for 15 independent experiments. At least three basins with different 
energies are present. Clearly, the simulation at this scale of time, is 
not ergodic. 

The number of replicas used in the calculation is M — 40, 
and K — 10"*. The upper panel of Fig. 2 shows the sig- 
nal lif (0) [evaluated using Eq. (p4[)], which grows along 
the entire simulation. This is the behavior expected in 
the non-ergodic regime. In the lower panel we can see 




log, (b) 

Fig. 3: Upper panel: Ti,if (in units of as a function of log2(6) 
for Rc = 4(T, 3(T, 2.5(T, and 2a. For the two larger radii the full line is 
the best fit to the data points, according to Eq. ( ^^ with </> defined 
in Eq. (^^. The lower panel shows the linear behavior of ^ k I'^bK 
vs. log2(6), for Rc = 4(7 and 3(7. K has been set to 10^. 

There are three sets of curves, each of which is indicative 
of sampling of at least three different energy basins. At 
low values of K the curves in the lower panel differ signif- 
icantly. At if ~ 4 000 the high energy basin curves begin 
to decrease in energy. For a value of K larger than the 
data displayed in Fig. 2, the curves can be expected to 
coalesce with the low energy basin curves. It is clear that 
for K < 10 000, the simulation is not ergodic. 

In PT and Jw studies it is essential that the initial high 
temperature walk be ergodic. Ergodicity can be attained 
for LJ13 by reducing the radius of the constraining poten- 
tial so that evaporation events are rare. We now present 
a study of Tk as a function of K for several values of 
Rc- To determine Tk, we have calculated the Fourier 
transform function yR-(w„) using Eq. (^) at a series of 
frequencies ujn = 2Tm/K where n has ranged from 1 to 
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min(-\/T26AY207r, 100). This range of frequencies ensures 
the hnear approximation used in Eq. (pq ) is vaUd while 
including sufficient numbers of points for accuracy. p7| 
Using Eq. (|6|), we have calculated the slope of the imag- 
inary part of l/YR-(wn) as a function of a;„, for these fre- 
quencies. The data points appearing in Fig. 3 are the 
mean value over twenty independent calculations of the 
slope of 1/Yft:(ijj„). 



1.50 




1 2 3 4 5 6 

log, (b) 

Fig. 4: Ti,;^; (in units of e^) and its error vs. Iog2(fe) for 
Rc = 2.5a and 2a, with K = 10*. When T^x is on the order of its 
own error, the simulation can be considered ergodic. For Rc = 2a 
the simulation becomes ergodic at log2(6) ~ 4 (bK ~ 16 X 10*). For 
Rc = 2.5a a longer simulation is needed to reach ergodicity. 

Starting from random configurations, wc have per- 
formed 5 X 10" Met passes at ksTh/e = 0.393. After 
this warmup process, we have created sequences of sizes 
hK = 10", 2 X 10", 4 X 10", . . ., 64 X lO". The results are 
presented in Fig. 3 for Rc — 4(t, 3(t, 2.5f7, and 2(T. The 
upper panel shows Tf,x a function of log2(6), for fixed 
K = lO". We have chosen to present the data using base 
2 logarithms for clarity (each increase by 1 unit of log2 [h) 
represents a factor of 2 scale increase). All the data de- 
crease with increasing b, but only Rc — 2a and Rc — 2.5(7 
appear to vanish to within the error bars over the time 
scale of the current simulation. In the lower panel we 
present T K/^bK as a function of log2(6) for Rc = A and 



3cr. The decay law suggested in Eq. (^) with cf) given by 
Eq. ( |53|) is satisfied for both radii. 

2.0 I 




log, (b) 

Fig. 5: The upper panel shows the decay behavior of Tj/^ (in 
units of s^) as a function of log2(fe) for PT and Met, at the tem- 
perature of the melting peak of the heat capacity, kgTm/s = 0.282. 
From Eq. (|52[), we plot log2 {"^ K /"^bK) vs. log2 (b), to extract the 
value of the exponent v (the slope of the linear fit) . We have found 
V = 0.93 ± 0.03 for PT, and v = 0.94 ± 0.02 for Met. The straight 
lines are the best linear fits of the data points. 

We have stated that the simulation can be considered 
effectively ergodic when T k is indistinguishable from zero. 
In Fig. 4 we have plotted T^k and its statistical error as a 
function of log2(&) for Rc = 2.5 and 2a. For Rc = 2a the 
crossing point of Tt,K and its error is at bK ~ 16 x lO". 
For Rc = 2.5a the crossing point is at a bK > 64 x lO". 
We can conclude that for ksTh/e = 0.393 and Rc = 2a 
the simulation can be considered effectively ergodic after 
16 X lO" Met passes. 

Once a constraining radius is chosen, PT and Jw sim- 
ulations require the highest temperature Th be chosen so 
that Met is ergodic. For a given Rc, the extent of ergodic- 
ity can be tested using the same metric that has been used 
for determining the optimum value of Rc, but by varying 
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the temperature. For the parameters kBTh/e = 0.393 and 
Rc = 2(7 the simulation is ergodic even at very short se- 
quence lengths. We have found that for fc^T/i/e < 0.393 
the simulations are not ergodic. To be sure that the pa- 
rameters are appropriate, we have performed a short PT 
simulation (Iff* passes, ten PT passes consists of nine Met 
passes plus an exchange attempt) with 40 equally spaced 
temperatures in the range ksT/e =[0.028,0.393] in order 
to obtain a first estimate of the position of the melting 
and boiling temperature regions. The boiling peak in the 
specific heat appears to be located at a higher tempera- 
ture than kgT/e = 0.393. Moreover, the value of Cy at 
kBT/e — 0.393 is about one-half the value of Cy at the 
temperature of the melting peak ksTm/e — 0.282. From 
these results we feel it is safe to choose Rc — 2a and 
ksTh/s — 0.393 for the calculations that follow. 

We now illustrate the convergence characteristics of T^- 
when we increase the total time scale of the calculation by 
a factor b. We illustrate this behavior using a PT simula- 
tion of LJi3, and we focus on results at the temperature of 
the melting peak in the heat capacity (fesTm/e = 0.282). 
We choose this temperature, because from experience 
we know the statistical fluctuations are large at 
the melting heat capacity maximum. The large statistical 
fluctuations make it possible to emphasize the behavior 
of Tk- We have run the PT simulation at 40 equally 
spaced temperatures in the range ksT/e = [0.028,0.393]. 
The initial warmup time has been set to 10^ Met passes, 
followed by 2 X 10^ PT passes. Following the warm-up 
period, we perform simulations of 10^, 2 x 10^, 4 x lO'^, 
8 X 10^ 16 X 10^ and 32 x 10^ PT passes. In each case the 
initial configuration has been taken to be the last config- 
uration of the previous run. The output of the simulation 
are sequences of the potential energy. T k has been deter- 
mined in the same way as in the calculation of the high 
temperature parameters (presented in Fig. 3 and Fig. 4). 
The data points appearing in the upper panel of Fig. 5 
are the mean value over twenty independent calculations 
of the slope of l/YR:(a;„). In the lower panel of Fig. 5 
we have plotted log2 {Tx/'^bK) as a function of logj (b), 
where K = 10^ and 6 = 1, 2, 4, ... , 32. The slope of the 



the calculated exponent. 



linear fit is the exponent u, according to Eq. (52). At the 
temperature of the melting peak, v = 0.93 ± 0.03. 

It is of interest to perform a similar study of the behav- 
ior of Tk as a function of the time scaling for an Met cal- 
culation. We have taken the final configuration of the PT 
simulation at ksTm/s — 0.282 as an initial configuration, 
and we have performed a simple Met simulation at that 
melting temperature. A graph of T},k and log2 (Tj<-/Tf,x) 
as a function of logj (6) for Met is also presented in Fig. 
5. From the upper panel of Fig. 5, it is evident that Met 
results are not ergodic within the same scaled time as the 
PT results. It is also evident that the power law exponent 
for both Met and PT are not distinguishable. Similar 
studies of the power law using the Jw method also give 
the same exponent. Neither an increase in the number 
of temperatures nor changing the distribution of temper- 
atures in both Jw and PT simulations has any effect on 




Fig. 6: Comparison of tlie Met and Jw diffusion coefficients 
with ttie PT diffusion coefficient as a function of tfie reduced tem- 
perature. The dashed Une represents equivalence between methods. 



By using the results to compare the relative efficiencies 
of Met, Jw and PT simulations for the LJ13 system. We 
have found that PT and Jw simulations can be considered 
ergodic if the run length is on the order of 2 x 10^ passes, 
while Met simulations that are initialized from configu- 
rations generated from an ergodic PT study are ergodic 
when the total run length consists of 2 x 10^ passes or 
more. 

In order to compare approaches, we have calculated F 
as a function of the reduced temperature, for the three 
methods. The comparison of diffusion coefficients from 
different algorithms has also been used by Andricioaei and 
Straub [|l^. The comparison of Jw and Met with PT is 
presented in Fig. 6. The Jw and PT simulations are found 
to have comparable efhciencies using F as a measure for all 
calculated temperatures. At intermediate temperatures, 
Met is significantly less efRcient. We have chosen to trun- 
cate the Jw study at ksT/e = 0.12. For temperatures 
below kBT/e = 0.12, Jw simulations require significant ef- 
fort, because a large set of external distributions must be 
generated. Because at temperatures below ksT/e = 0.12 
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LJi3 is dominated by structures close to the lowest energy 
icosahedral isomer, we expect the Jw and PT methods to 
have similar efficiencies (as measured by F) for all tem- 
peratures. 



VI. CONCLUSIONS 

In this paper we have presented a study of the approach 
to the ergodic limit in MC simulations. In all the cases 
examined, the behavior of the MC metric dk can be ap- 
proximated by Eq. (^) , and the behavior of TbK satisfies 
Eq. (^). Because the exponent v is smaller than one for 
all the cases studied, the dependence of the non-diffusive 
contributions on dk is weaker (in the sense of Appendix 
A) than the diffusive contributions. The assumption on 
which we have built the stochastic model have been veri- 
fied numerically for a system having a sufhciently complex 
potential surface to be viewed as prototypical of a large 
set of many-particle systems. 

The MC metric used in this work appears to be a valu- 
able tool to study the ergodicity properties of MC simu- 
lations. The non-ergodic components of the MC metric 
enable the prediction of the minimum length a MC simu- 
lation must have in order to be considered ergodic. The 
comparison of T from different algorithms gives a reason- 
able estimate of their relative efficiencies. 

From the study of the melting region of 13 particle clus- 
ters, we have found that the exponent v depends both on 
the method used and the nature of the potential energy 
function. We have performed calculations, not discussed 
in this work, where the functional form of the potential 
energy is modified. These studies have shown v to be 
dependent on the details of the potential. We have not 
found the exponent u to be a strong function of method. 
Although PT and Met have significantly different efficien- 
cies as measured by their relative diffusion coefficients, v is 
nearly the same in the two methods. The difference in the 
decay of T k appears to be dominated by the coefficient 
in Eqs.(^) and (52) rather than the exponent. 

As discussed in the text, parallel tempering and J- 
walking studies of many-particle systems must have an 
initial high temperature component that is chosen so that 
a Met simulation is known to be ergodic. For cluster sim- 
ulations that require an external constraining potential to 
define the cluster, the radius of the constraining poten- 
tial must be carefully chosen in order to achieve ergodic 
results. We have found the metric and associated decay 
laws developed in this work to be a particularly valuable 
method of choosing these initial parameters in both par- 
allel tempering and J-walking simulations. 

We also remark that the metric introduced here may be 
a more sensitive probe of ergodicity than may be required 
in some applications. For example in previous J-walking 
studies |2^ of the 13-particle Lennard- Jones cluster, the 
heat capacity curve determined with a constraining radius 
of 4(7 is nearly indistinguishable from the curve obtained 
with a constraining radius of 2a. From the results of this 



work, we know the initial high temperature walk is not 
ergodic when a constraining radius of Icr is used. It is 
striking that the non-ergodicity as measured by the en- 
ergy metric is not apparent in the heat capacity curve. 

We have constructed a metric based on an ensemble of 
MC trajectories. By using an ensemble we attempt to 
cover sufficient portions of space so that all components 
are accessible. In practice only a finite subset of a full 
ensemble can be included, and it is always possible that 
components of space are missed. In such a case T x may 
decay to zero numerically within the subspace, and the be- 
havior may give misleading evidence that the simulation 
is ergodic. Because components of space may be missed in 
any finite simulation, it is impossible to guarantee ergod- 
icity. It is hoped by using a sufficiently large ensemble of 
trajectories to define the metric, the possibility of missing 
components is minimized. 
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APPENDIX A: WEAK DEPENDENCE OF THE 
NON-DIFFUSIVE CONTRIBUTIONS 

We have considered two overall time scales for a MC 
simulation. Properties calculated at short times (labeled 
k in the discrete case) provide information about each step 
of the MC process, and properties averaged over the total 
simulation time (labeled K in the discrete case) give in- 
formation about the approach to ergodic behavior. When 
K is sufficiently short we have both diffusive and non- 
diffusive contributions as a function of k. In this Ap- 
pendix we explain the relative time dependence of the 
diffusive and non-diffusive contributions to the autocorre- 
lation function. 

It has been assumed that the autocorrelation function 
Eq. (^8|) can be expressed as the sum of diffusive terms 
plus non-diffusive terms, i.e. 



K{t,t') ^ Kd{t,t')+ J2 ^^ndj{t,t') 



(Al) 



i=X+l 



where 
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Ti + r2 + . . . + Ta 



Hnd,i{t,t') ^ — jA-^ 



(A2) 
(A3) 



Increasing the time variables by a factor 6 > 1, such 
that Tx <C foi> ^ ta+i, with A > 1, we can study the 
relative variations of each contribution to the correlation 
function, diffusive and non-diffusive (labeled hy £ > A). 
In this Appendix we only consider values of b such that 
the transformation t ^ bt does not increase the time scale 
beyond the local correlation time. In Appendix B values 
of b are considered that do cross such time scales. 

By relative variations we mean 



Ad{t,t';b) 
And,e{t,t';b) 



Kdibt,bt') - Kdit,t') 



K.„d,e{bt,bt') - Knd.,l{t,t') 



(A4) 
(A5) 



Knd.,e{t,t') 

The relative variation of each non-diffusive contribution is 



And,e{t,t';b) 



1 



62 



(A6) 



whereas the relative variation of the diffusive contribution 
is 



Ad(t,i';6) = l-i , 



(A7) 



If Ad{t,t';b) > And,e{t,t';b) for all pair of times t and 
t' and for all 6 > 1 such that bty ^ Te, we say that the 



non-diffusive contributions are weaker than the diffusive 
contribution in their dependence on t. We explore, in the 
remainder of this appendix, the properties fe must have 
in order that the inequality Ad{t,t';b) > And,e{t,t';b) is 
satisfied. 



Lemma: If the function Hi{t; r) 



dt' h 



> 



V t and T 



satisfies the inequality 

He{t;T) > tfe 
then, Hf {t; r) is an increasing function of r. 



Demonstration: For £ and t fixed, the function 
Hi{t,T) evaluated in r' is 



(A8) 



(A9) 



= / dt' h ["4- 



Tt/r' 



du 



then, for At > 



Ht{t-T + AT)-Hl{t-T) 

At 



-LJ 




At 


[ ^ 


-LJ 


' At 


At 


[~ 


-LJ 


' At 


At 





-t/(r+Ar) /,, 
dt' h 







-t/(T+Ar) 



= -HtiTt/T'-T) 
T 



* ft' 
dt' h 



dt' h 







Tt/{T + AT) 



dt' fe{-]- 



dt' fi 

Tt/(r+Ar) 

^At /r 
T + At •'H T 



(AlO) 
(All) 

(A12) 
(A13) 

(A14) 
(A15) 
(A16) 



where t* e [tT/[T + AT),i]. In the limit At —f 0, and 
by virtue of the continuity of /f, the derivative takes the 
form 



dH,{t-T) 1 r (t 

-^— = -{He{t;T)-tk[- 



(A17) 



Then, dHe{t;T)/dT > 0, and Hi{t\T) is an increasing 
function of t. □ 

Here we have pres ente d the two first conditions fi must 
have, namely Eqs. (|A|) and (^. From Eq. (|l|) /^(O) 
is a global maximum, and the memory functions must 
have a positive peak at zero. The area below that peak 



must be sufficiently large to satisfy Eq. (AS). Moreover, 
fi{0) must be sufficiently large to satisfy Eq. (A9), even 
at points where fe{t/T) is a local maximum. Then, to 
satisfy this Lemma, we need a memory function with a 
sufficiently large global maximum at t — Q. 

Corollary: Suppose He{t;Ti) > tf({t/Ti). If 5 > 1, 
then < And,e{t,t';b) < 1 for all pair of times t and t' . 

Demonstration: Under the change of scale in time 
t bt, Knd,e(t,t') can be written 
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nnd,i{ht,ht') - 15777 f^'^^^fy^^ Vj' (^^TT^ ' ^^^^^ 
- I'dt, f'dt,lfj'll^) , (A19) 



bHt 

tt' Jo "Jo " n"" \ Te 
then, the quotient K„d ,e{bt,bt')/ Knd ,iit,t') is 

n^,AbtM') (^) +/o^"^^^ (^)} 

^ fodti {He{h;Te/b) + Hi{t> - hin/b)} 



(A20) 
(A21) 



By Eq. Hi{t]T) > OV i and r. By the 

Lemma the numerator is smaller than the denomina- 
tor. Then < Knd ,e(bt,bt')/ Knd,i{t,t') < 1 and then, 
< A„d,e{t,t';b) < i. □ 

Theorem: Suppose that 6 > 1 is such that T£_i ^ 
bty <C T£, Hilt; Tj) > tfiit/Ti), and all satisfy the Lip- 
schitz condition (for all closed interval A exists a real 
positive number Ce such that 



Demonstration: lfAndj{t,t';b) < Ad{t,t';b), then 

"|*l-t2 



1 1 lo'dti dt2 L 

'--b^'-v- 



|tl-t2| 



1 < 



(A23) 
(A24) 



\h{x)- fi{y)\<Ci\x-y\ (A22) 
for all X and y i\i A). Then A„d , ^(i, t']b) < Ad{t, t'; b) where the operations to reach Eq. ( |A24| ) are vaHd by using 



if and only if fi is non-negative in the interval [0,t>). 



< 



dti 



dt2 -rft 
b 



< I dti I dt2{b f, 
rti 



< 
< 
< 

< 



dh 



dti 



dto 



\tl~t2\ 

f b\ti~t2\ 
b{ti-t2) 



ti 



dti 



dti 



dt 



bti 



bf, 



dti I dt2 ft: 
JO 

\tl - t2 

fi 



Corollary. Then 

\t1-t2y 



dt fe 







ti 



t 

dtfe[- 



fi - 

Ji 



dt fe 



t>-ti 



dt-,. 



dt 



bfi 

' fi 



b{t>-tl) 



dt ft 



f b(t2-ti) 

t 

t>-ti 

dt fe 



t2-ti 







dtfA- 

t>-ti \Ti 



Using the intermediate value theorem. 



we have 



/or 
dt' fi 



Tl 



= {b-l)tfl 
^{b~l)tfl 



t*{t) 



Ti 



+ ib-l)t 



fi 



Tl 



fi 



(A25) 
(A26) 
(A27) 
(A28) 
(A29) 

(A30) 

(A31) 

(A32) 



where t*{t) e [t,bt]. Let be and t*p{t) the values at which the intermediate value theorem is satisfied, in the 

intervals [t, bt] and [ty — t, b{ty — t)] respectively 
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(6-l)(<>-i)/, 



dt' fA-] , 

t>-t VT«/ 



then, the remainder can be written as 



fi 



Re{t<,t>;b) = dt <t 



-h 



By the Lipschitz condition, we have that 



Rt{t<.ty-h) < dt {t fi 



tm-t 



-h 



+ {t> - 1) 



re 



+ {t> - t) Gi 



+ {t> - t) 



-fi 



ty - t 

Te 



ty - 1 
Te 



t*p{t) - {t> - t) 



Te 



<— dt {t \bt -t\ + (t> - t) |6(i> -t)- (t> - t)\} 

Tt Jo 

<^ib-l) f^dt [e + {ty-tf] 

Tt Jo 



<—{h-l) ( ^i^+i<i>(t>-i<) 
Ti \6 

6 Te 



where Ci is a suitable positive real constant. Using Eqs. (A32) and (A35) in Eq. (A3^ we have 

+ {h-l)Re{t<,ty;h) 



Q< jjdt{b-l) |t/, 



0< / ^dttfe (- 
Jo \Te 

0< / ^dttfe 
Jo \Te 



dttfe (-] +Re{t<,ty-b) 
t^-t^ \TeJ 



>-•'< 



dttfe (- 
\Ti 



"dttfef^^] +Re{t<,t>;b) 



2 .3 Ce 



< Feit<) + Fe{t>) - Fe{t> -t<) + -ti — {b- 1) 

o Te 



where 



Fe{t)=J^dt' 



is a continuous and difFerentiable function of t. The inequality (A45) holds for any 6 > 1. Suppose that 
Fe{ty) - Fe{ty - <<) < 0. Then, if b is such that 



6 = 1 



3 Tl 



2L tl Ce 



|F,(i<)+F,(i>)-F,(i> 



(A33) 
(A34) 

(A35) 

(A36) 
(A37) 
(A38) 
(A39) 
(A40) 
(A41) 

(A42) 
(A43) 
(A44) 
(A45) 

(A46) 

Fe{t<) + 

(A47) 



where L > 2, we have that 



< Fe{t<) + Feity) - Fe{ty - t<) + j \Fe{t<) + Fe{t>) - Fe{t> - <<)| 
< [F,(t<) + Feity) - F.ity - i<)] 



(A48) 
(A49) 



in contradiction with the hypothesis that Fe{t<) + Fe{ty) — Fe{ty — i<) is negative. Then 
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< Fe{t<) + Fe{t>) - F,{t> - t<) . (A50) 

Let us define the function 

AFiit) = Fi{t) - Ft{ty - t) , (A51) 
wliere t € (0, i>). The right derivative at t = of AFi{t) is 

AFjjAt) - AF,{0) _ ^.^ FejAt) - F,{0) + f,(t>) - f,ft> ^ At) 

At™+ At At™+ At 

1 f z"^*. . / t\ /•*> . /t ■ 



.iTo.A^ X '^"H^j^i-.f^n^'^ ^^^^^ 



If / f* \ If 



where t^; € [0, At] and t\ G [t> - At, t>]. Thus 



dAFeit) 



dt 



-.0+ 



(A55) 



If the right derivative at of AF£(t) is negative, Ai^£(t) 
approaches —Fg{ty) from below, when t — > 0. There exists 
a time < t < t>, s uch t hat > F£(t>) + AFe{i), in con- 
tradiction with Eq. (A50). Then, must be non-negative 
for t e (0,t>). By the property Eq. (|l|) /^(O) must be 
positive. This proves that And,i{t,t';b) < Ad{t,t';b) =4> 
fe{t) > for < t < t>. To demonstrate that if fi is pos- 
itive yields A„£j ^ i(t, t'; fe) < Ad{t,t';b) (i.e. the converse) , 
follow the argument backwards, from Eq. (A30). □ 

In conclusion, if the memory functions are positive, sat- 
isfy the Li psch itz condition, and satisfy the condition Eqs. 
( |A8| ) and (A9), the non-diffusive contributions are more 
weakly dependent on time than 1/t. 

The results of the present appendix are valid in the limit 
of a complete ensemble. In our numerical experiments 
only partial samples of the ensemble can be considered. 
The memory functions that appear in our numerical cal- 
culations come from partial mean values of the product of 

^ -{Gi{bt/n)Gi{bt'/n)) = ^ 



bHt 



bHt 
1 



discontinuous functions (every noise process is a discon- 
tinuous function). These memory functions are discon- 
tinuous. The behavior of the non-diffusive contributions 
observed in our numerical experiments is in agreement 
with these analytic (infinite ensemble limit) results. We 
can infer that there might be a version of the theorem 
applied to discontinuous memory functions, but we have 
been unable to develop such a theorem. 



APPENDIX B: CONSEQUENCES OF THE TIME 
SCALE CHANGE IN THE NON-DIFFUSIVE 
CONTRIBUTIONS 

In this appendix we show the behavior of the func- 
tion /i when its correlation time is changed according to 
Ti — > Tbi = Ti/b, with & 3> 1; i.e. when the total simu- 
lation time is scaled to exceed the correlation time of the 
first colored noise process. 

We multiply the time variables by a number 6, such that 
Ti ^ 6t> ^ T2. We have that the gi process contributes 
to the autocorrelation function with 



"'0 



1 



— / dt\ I dt',-f,r-^^ 
btt' Jq Jq m \ Tfci 



(Bl) 
(B2) 



where t' — t/b and Tbi = Ti/b. We want to compute this contribution both within the neighborhood ti = t2 as well as 
outside such a region. To do so, we can split the integral in Eq. (B2) in three parts 



where 



bHt 



-{Gi{bt/n) Gi{bt'/n)) =h + l2 + h 



t< ^max(0,ti-e/2) 



Jo 



Tbl \ Tfcl 



Jmax(0,ti-e/2) '''bl 



Tbl 



1 .t< .min(t>,ti+./2) ^ (\t,-h\ 

-'2 = ^ / *1 / 0.t2 — /l 



in(t>,ti+e/2) '''bl 



at2 — fi 



Tbl 



(B3) 

(B4) 
(B5) 
(B6) 
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with t< > e > (observe that the only integral involving ti — t2 is 12)- Consider Ii. If ti < e/2 the inner integral is 
zero. Therefore, ti must be bigger than e/2 and 

ii = r^*^ r^^^'dh —h (^-^) , (B7) 

OT<r> Jo ni V "^fji / 

which, by virtue of the continuity of /i, can be bounded as follows 



1^ /*'.„i(,-l);75!:^)<,,< 1 

ct> Je/2 n V 27 \ Ti J bt<_ty J^,2 Ti \ 



bt<t^ 7e/2 n V 2/ V Ti y 6t<t> 7e/2 n V 2/ V '^i 



2 i<i> Ti \ Ti / 2 t<i> Ti \ Ti 

where tmax (tmin) IS the time in the interval [e/2,<<] at which the function /i reaches its maximum (minimum) value. 
Because /i is continuous, there exists G [tmimtmax] at which 



2 i<t> ri \ Ti 

Consider now I3. If ti + e/2 > i>, the inner integral is zero. Therefore, < ii < min(i<,<> — e/2) and 



^ |.min(t<,t>-e/2) / 12 — t 



/3 = 7777 / f^^l / — /l 



^ii' Jo Jti+e/2 Tbl \ ni 

min(i<, t> — e/2) 



el 

t> - - - -mm(t<,t> - e/2) 



(BIO) 



where ^3 € [^min, ^maa], and now tmax (tmin) IS the time in [e/2,i>] at which the function /i reaches its maximum 
(minimum) value. 



fi(\h-t,\/x,,)/x, 




■(e/2-f,) 



2t, 



e/2 + t, 



Fig. 7: The area under the curve represents the first integral in Eq. (Bll). The darker piece is half of the integral in the interval 
[— tijti], the lighter is half of the integral in [—e/2, e/2]. 

Let US consider now 12- First observe that for the integral in ti, if < ti < e/2, max(0,ii — e/2) — and 
min(i>,ti + e/2) = ti + e/2. If e/2 < ti < t< then max(0,ti - e/2) ti - e/2. Then 



h = 



Jo Jo ^bl V ^bl 



t< /'inin(f:> ,ii+e/2) 

dh / 

/2 Jti-e/2 



Tbi \ ni 



(Bll) 
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The integral in t2 between and <i + e/2 can be evaluated with the help of Fig. 7 



Tbl V '^bl 



/2 1 

dt —h 
-e/2 Tbl \Tbl 



^ J-u ni 



ni 



(B12) 



The second integral in ti can be separated in two parts; the first for e/2 < ti < niin(t<. t> — e/2) and the second for 
min(t<,f> — e/2) < ii < i<. If t> — i< < e/2 the second term is zero. Then 



dti 



/2 



min(t>,ti+£/2) 
ti-e/2 



min(i< ,i> —e/2) 



dti 



/2 



e 



(I 



dii 



i>-e/2 Jti-e/2 



min(i> ,ii+e/2) -i 

dt2 — /l 

ti-e/2 Tbl 

min(i > .ii4-e/2) 

dt2 —fi 



Tbl 



Tbl 



h - to 



Tbl 



(B13) 



where 9 is the step function. If ti < min (f<, t> — e/2) then min(i>,fi +e/2) = ti +e/2. The last integral in t2 can be 
rearranged in the same way as Eq. ( pT^ ). Then 



Je/2 



min(f> ,ii4-e/2) 

di2 — /i 

ii-e/2 Tbl 
1 



|tl - t2 
Tbl 



min(t<,t>-e/2) 



dil 



/2 



e/2 



1 



e - + t< 



dti 



t>-e/2 



e/2 



-e/2 Tbl VTbl 



— /i — ) + 

-e/2 Tbl \Tbl, 



1^1 



— /i — 



Tbl 



Tbl 



(B14) 



We can observe that the correlation time Tbi goes to zero when h is in creased. The function (1/ti) fi{bt/Ti) becomes 
negligible outside a neighborhood of t = [observe Eqs. (B9) and ( B1C| )]. Equation ( [l^ ) holds, then, if b is sufficiently 
large, (1/Tbi) /i(i/Tbi) can be considered a delta function. The integrals Ii and I3 become zero, and the integrals 
involving t = in the expression of I2 converge to one. I2 becomes 



e 



} ~ 



(B15) 



which is a diffusive contribution to the autocorrelation 
function. The autocorrelation function becomes then 

.{bt, bt') = ^iL±ii +i:^fA ^) . (BI6) 

The same argument can be used when b is such that 
T2 ^ 6t> <C T3. After such changes in the time scale, the 
diffusion coefficient F = Fq + Fi is enlarged, and the non- 
diffusive contributions are reduced. There is an ultimate 
scale change, such that ta <C 6i> . Beyond this maximum 
time scale the process can be considered diffusive. 
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